DEDICATORY FOREWORD


THIS PAPER IS REVERENTLY DEDICATED TO THE MEMORY OF

George Udny Yule, 1871-1951

When  this  paper  was  half  written,  the  authors  learned  of  the  death  of  George 
Udny  Yule.  His  death  closed  the  early  epoch  of  the  development  of  the  theory  of 
statistics — an  epoch  marked  by  the  names  of  F.  Y.  Edgeworth,  W.  S.  Gosset 
(“Student”),  Major  Greenwood,  Karl  Pearson,  W.  F.  Sheppard,  and  Yule,  himself. 

The  contributions  of  Yule  were  numerous  and  were  concerned  with  a  number  of 
aspects  of  statistical  research,  frequently  in  advance  of  his  contemporaries.  For 
many  years,  Yule  was  best  known  as  the  author  of  the  book,  An  Introduction  to  the 
Theory  of  Statistics.  First  published  in  1911,  this  book  has  had  fourteen  English 
editions  (since  1937,  revised  editions  have  appeared  under  the  joint  authorship  of 
G.  U.  Yule  and  M.  G.  Kendall)  and  for  a  long  time  was  the  only  worthwhile  book 
on  statistics;  several  translations  have  also  been  published. 

In  more  recent  times,  owing  to  the  number  of  entirely  new  developments,  the 
relative  importance  of  the  Introduction  decreased  and  the  name  of  George  Udny 
Yule,  as  its  author,  began  to  slip  into  oblivion.  At  the  same  time,  however,  his  name 
began  to  appear  in  the  literature  in  various  other  connections — particularly  in 
connection  with  what  is  now  known  as  the  theory  of  stochastic  processes.  Although 
by-passing the Introduction, modern statistical thought eventually caught up with
a number of fruitful ideas published by Yule in the 1920’s. At the time these ideas
went hardly noticed but now proved aere perennius. Yule’s own attitude towards
mathematical statistics was distinctly nonmathematical, and it is, therefore, remarkable
that his nonmathematical writings should now become a source of inspiration
in the mathematical theory of stochastic processes. To us, this is the finest
possible  testimony  to  Yule’s  great  scientific  talent,  and  it  is  hoped  that  the  frequent 
references  to  Yule  by  such  authors  as  William  Feller,  Hermann  Wold,  and  others 
may  have  cheered  the  aged  scholar  during  the  last  years  of  his  life. 

In  1931  Yule  felt  that  he  was  too  old  to  hold  the  position  of  Reader  at  Cambridge 
University  and  retired.  At  the  same  time  he  felt  young  enough  to  learn  to  fly. 
Accordingly,  he  went  through  the  intricacies  of  training,  got  a  pilot’s  license,  and 
bought  a  plane.  Unfortunately  a  heart  attack  cut  short  both  the  flying  and,  to  a 
considerable  degree,  his  scholarly  work. 

Most of Yule’s active life (roughly from 1897 to 1938) coincided with a tumultuous
period in the development of mathematical statistics, when true scholarly
achievements were accompanied by outbursts of personal animosities, noisy
self-glorifications, and bitter disputes. Yule was an active scholar and it was natural for
him  to  be  under  attack  from  time  to  time.  However,  to  our  knowledge,  nothing 
Yule  ever  wrote  conflicted  with  the  dignity  of  the  spirit  of  research,  and  his  name 
enters  history  unmarred. 

The  range  of  Yule’s  scientific  contributions  was  very  broad.  Among  other  things, 
he  did  pioneering  work  on  accident  proneness,  in  collaboration  with  Greenwood.  In 
fact,  the  first  line  of  the  Introduction  of  the  present  paper  contains  a  reference  to 
their  fundamental  memoir.  It  is  fitting  that  this  paper  be  dedicated  to  the  memory 
of George Udny Yule.


CONTRIBUTIONS TO THE THEORY OF
ACCIDENT PRONENESS


I.  AN  OPTIMISTIC  MODEL  OF  THE  CORRELATION  BETWEEN 
LIGHT  AND  SEVERE  ACCIDENTS 

BY 

GRACE  E.  BATES  AND  JERZY  NEYMAN 


1. Introduction. Since the pioneer work of Greenwood and Yule [1]¹ and of Miss
Newbold [2], the following assumptions regarding accident proneness are customarily
made:

a) To each individual exposed to a certain system of risks and to each kind of
accident there corresponds a Poisson frequency function,

$$p_X(k \mid \lambda) = e^{-\lambda} \frac{\lambda^k}{k!} \tag{1}$$

of the number $X$ of accidents of this particular kind incurred by this individual per
unit time.

b) The value of the parameter $\lambda$ varies from one individual of the population to
another and characterizes his specific accident proneness.

c) More specifically, it is frequently assumed that for an individual randomly
selected from a given population exposed to a fixed system of risks, the parameter $\lambda$
is a particular value of a random variable $\Lambda$ with probability density function

$$p_\Lambda(x) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \tag{2}$$

where the constants $\alpha > 0$ and $\beta > 0$ depend on the population considered and
on the kind of accidents.

d) It is customary to assume that, although with the passing of time the value
of $\lambda$ corresponding to a given individual may change, this change is slight only and
an individual who is particularly prone to accidents in his youth remains a bad risk
more or less indefinitely.
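
The combined effect of assumptions (a) through (c) can be checked numerically: mixing the Poisson law (1) over the Pearson type III density (2) should give a negative binomial distribution for the number of accidents of a randomly selected individual. The Python sketch below is only an illustration; the function names, the midpoint-rule integration, and the values alpha = 2, beta = 1.5 are assumptions made for the example and do not come from the paper.

```python
import math

def poisson_pmf(k, lam):
    """Poisson frequency function (1): P{X = k | lambda}, for lambda > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def gamma_density(x, alpha, beta):
    """Pearson type III density (2) with exponent alpha and rate beta, for x > 0."""
    return math.exp(alpha * math.log(beta) - math.lgamma(alpha)
                    + (alpha - 1) * math.log(x) - beta * x)

def mixed_pmf(k, alpha, beta, upper=60.0, steps=60000):
    """P{X = k} for a random individual: midpoint-rule integral of (1) times (2)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        lam = (j + 0.5) * h
        total += poisson_pmf(k, lam) * gamma_density(lam, alpha, beta)
    return total * h

def negative_binomial_pmf(k, alpha, beta):
    """Closed form of the same probability (univariate negative binomial)."""
    return math.exp(math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
                    + alpha * math.log(beta / (1.0 + beta)) - k * math.log(1.0 + beta))

if __name__ == "__main__":
    # Illustrative parameter values, chosen arbitrarily.
    for k in range(5):
        print(k, mixed_pmf(k, 2.0, 1.5), negative_binomial_pmf(k, 2.0, 1.5))
```

The two columns printed by the script should agree closely; the remaining discrepancy is the error of the numerical integration.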

The evidence in favor of (a), (b), and (d) frequently appears quite convincing.
Therefore,  in  selecting  personnel  for  certain  hazardous  occupations,  attempts  are 
made  (Farmer  and  Chambers  [3])  to  eliminate  individuals  who  are  particularly 
accident  prone  by  employing  only  those  who  in  the  past  had  no  accidents  of  the 
particular kind under consideration or only a few such accidents. Also (Ove Lundberg
[4]) attempts are made to use records of accidents sustained and of cases of
illness to adjust the premiums in accident and health insurance to actual risks
attached to particular individuals. In each instance, attention is directed to accidents
or cases of sickness occurring in two different periods of time (past and future
experience) but belonging to the same category. The problem studied is essentially
whether or not the number of accidents of a specified kind observed in the past has
a predictive value for the number of accidents of the same kind to be observed in
the future.

This work, begun under contract with the School of Aviation Medicine, U.S. Air Force, was
completed with the partial support of the Office of Naval Research. Dr. Bates, a member of the
faculty of Mount Holyoke College, worked at the University of California on this project.

¹ Numbers in brackets refer to references at the end of the paper.

This  question  is  very  relevant  in  many  cases.  However,  for  certain  purposes  it  is 
not  completely  relevant  and  must  be  modified.  Such,  for  example,  is  the  case  when 
it  is  desired  to  select  appropriate  personnel  for  highly  hazardous  occupations  (for 
example,  airplane  pilots)  where  the  first  accident  observed  is  frequently  also  the  last. 
For this very reason, in selecting the personnel it is impracticable to judge the
individuals on their past experience with respect to the particular severe accidents and
the most one can do is to see whether or not the frequency of mild accidents incurred
in the past is relevant from the point of view of severe accidents to which the
individual may be exposed in the future.

Pursuing this direction of thought we shall study not one but two (or more;
further generalization is immediate) random variables, say $X$ and $Y$, representing
the numbers of accidents incurred by the same individual, either within the same
period of time or in two different periods. The variable $Y$ will mean the number of
“predictor” accidents, which we may hope to be able to observe prior to the decision
of whether or not the given individual is suitable for the particular employment.
On the other hand, the random variable $X$ will be interpreted as the number of severe
accidents to be observed in the future.

As in the theory of Greenwood, Yule, and Newbold, we shall postulate that, for
each individual, the variables $X$ and $Y$ are independent and follow two distinct
Poisson distributions with parameters $\lambda$ and $\mu$ which characterize the proneness of
this individual to the two kinds of accidents. Furthermore, we shall postulate that
the values of $\lambda$ and $\mu$ vary from one individual to another.

In order that the value of $Y$ can serve as a predictor regarding the value of $X$ it is
necessary that $\lambda$ and $\mu$ be correlated in the population considered and the closer the
correlation, the greater the value of $Y$ as a predictor. Whether or not the constants
$\lambda$ and $\mu$, corresponding to two different kinds of accidents, are closely correlated is a
question of fact and can be answered only by using appropriate empirical data.

The main purpose of the present paper is to study the distribution of $X$ and $Y$ on
the following somewhat far-reaching hypothesis. This hypothesis will be frequently
referred to in this paper so it will be conveniently labeled the fundamental hypothesis.
It involves two assumptions:

i) the expectation $\mu$ of the number of predictor accidents is a fixed multiple of the
expectation $\lambda$ of the number of severe accidents, $\mu = a\lambda$, where $a$ is a constant;

ii) in the population studied the distribution of $\Lambda$ follows the Pearson type III
law assumed by Greenwood, Yule and Newbold, as described in (c) above.
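
To make the fundamental hypothesis concrete, the following Python sketch simulates a population in which each individual has a proneness drawn from the Pearson type III law and, given that proneness, independent Poisson numbers of severe and predictor accidents with expectations lambda and a times lambda. It is a hedged illustration only: the function names, the Knuth-style Poisson sampler, and the parameter values in the usage section are assumptions introduced here, not part of the paper.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_population(size, alpha, beta, a, seed=0):
    """Simulate (X, Y) pairs under assumptions (i) and (ii) of the fundamental hypothesis."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        # random.gammavariate takes a scale parameter, hence 1 / beta for rate beta.
        lam = rng.gammavariate(alpha, 1.0 / beta)
        x = sample_poisson(lam, rng)        # severe accidents, expectation lambda
        y = sample_poisson(a * lam, rng)    # predictor accidents, expectation a * lambda
        pairs.append((x, y))
    return pairs

def correlation(pairs):
    """Ordinary sample correlation coefficient of the simulated pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / math.sqrt(var_x * var_y)

if __name__ == "__main__":
    # Illustrative values: alpha = 2, beta = 1.5, a = 3.
    population = simulate_population(20000, 2.0, 1.5, 3.0, seed=1)
    print("sample correlation of X and Y:", correlation(population))
```

Although $X$ and $Y$ are independent for each fixed individual, the simulated counts are positively correlated across the population, because both depend on the same value of the proneness.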

It will be seen that assumption (i) is very strong and, a priori, one is inclined to
doubt whether it could ever be exactly satisfied. Surprisingly enough, the theoretical
joint distribution deduced from the fundamental hypothesis was found to give a
satisfactory fit to several empirical distributions. It follows, then, that the measures
of success of the selection for small values of $X$ using $Y$ as predictor, deduced in this
paper, may not be far off in relation to real practical problems. Needless to say,
practical applications of these formulae must be preceded by an empirical test of the
validity of the model studied with respect to the particular accidents which may
come under consideration.

Assumption (ii) is also very strong. However, any other assumption specifying
the distribution of $\Lambda$ would be equally strong but, if one wants to obtain numerically
a frequency function of $X$ and $Y$, it is unavoidable to ascribe a definite form to
the distribution of $\Lambda$. The adoption of the Pearson type III law is justified both by
its flexibility as an interpolation formula and by the tradition established by
Greenwood, Yule, and Newbold. However, in the course of the study it appeared that
some properties of the multivariate distribution of the numbers of accidents
satisfying assumption (i) are independent of the actual form of the distribution of $\Lambda$.
Also, they have an immediate bearing on the problem of selection of personnel and
for these two reasons are particularly interesting.

Part  II  of  the  paper  deals  with  the  possibility  of  a  deeper  insight  into  the  nature 
of  the  mechanism  behind  the  observed  frequency  distribution  of  the  number  of 
accidents  from  one  individual  to  another. 

The specific problem considered is that of the distinction between the
Greenwood-Yule-Newbold model described here and the model of Polya (slightly generalized),
assuming  that  the  probabilities  of  accidents  in  a  specified  time  interval  not  only  vary 
with  the  duration  of  this  time  interval,  but  depend  upon  the  number  of  accidents 
previously  sustained  (“contagion”)  and  on  the  length  of  exposure  to  accidents 
which  is  interpreted  as  a  measure  of  the  experience  gained  in  the  particular  kind 
of  work. 

The  details  of  the  plan  of  Part  I  of  the  paper  are  as  follows. 

In  section  2,  the  problem  of  the  joint  distribution  of  severe  and  light  accidents  is 
considered  in  a  form  which  is  a  little  more  general  than  that  envisaged  above. 
Assuming the fundamental hypothesis, we consider not two different kinds of
accidents but an arbitrary number $s \ge 2$, of which the first is treated as “severe
accidents” and the remaining $s - 1$ as different kinds of light predictor accidents.

Let $X_1, X_2, \ldots, X_s$ be the numbers of accidents of each kind. It is found that
these random variables follow a joint distribution which the authors do not
remember having seen before and which they propose to term the multivariate
negative binomial distribution. This distribution possesses several remarkable
properties, similar to those of the multivariate normal distribution. The more important
of these properties refer to any group of $m < s$ variables out of the $s$ considered.

i) Whatever the group of $m$ variables, for example, $X_1, X_2, \ldots, X_m$, the marginal
joint distribution of this group is an $m$-variate negative binomial.

ii) The joint distribution of $X_1, X_2, \ldots, X_m$ and of the sum, say
$\mathcal{X} = X_{m+1} + X_{m+2} + \cdots + X_s$, is an $(m + 1)$-variate negative binomial distribution.

iii) The conditional joint distribution of $X_1, X_2, \ldots, X_m$, given that the other
$s - m$ variables have assumed specified values, is an $m$-variate negative binomial
distribution and depends only on the value $x$ of the sum $\mathcal{X}$.

iv) The regression of $X_1$ on $X_{m+1}, X_{m+2}, \ldots, X_s$ is linear, for $m = 1, 2, \ldots, s - 1$.

Because of property (iii), the general case of $s - 1 \ge 1$ kinds of light accidents
reduces to the simplest case involving only two categories of accidents, severe
accidents and light accidents, with the latter category embracing all the $s - 1$ different
categories of light accidents originally considered.

Section 3 contains formulae leading to the estimates of the parameters in the
bivariate negative binomial distribution.

Section 4 is given to an empirical test of the fundamental hypothesis. As
mentioned above, the basic idea is that, for particular individuals in a population, the
expected number of light accidents in an earlier period is a fixed multiple of the
expected number of severe accidents in a subsequent period. Unfortunately, no
empirical data were available with which the authors could test directly whether or
not it is safe to assume this. The best that could be done was to study certain
analogous situations for which the data could be obtained. On the whole, the results
of this empirical study are promising.

The  fundamental  hypothesis  is  tested  on  two  sets  of  data,  one  of  which  is  new. 
Because  of  the  scarcity  of  published  empirical  material  of  this  particular  kind,  the 
new  data  are  reproduced  in  this  paper  in  several  tables  which  may  be  useful  for 
further  work. 

Section  5  is  given  to  the  following  practical  question:  assuming  the  admittedly 
far-reaching  fundamental  hypothesis  regarding  the  close  connection  between  light 
and  severe  accidents,  what  are  the  prospects  of  success  in  the  selection  of  personnel 
using  the  records  of  light  accidents?  It  is  shown  that,  in  certain  cases  at  least,  the 
effect  of  selection  must  be  substantial. 

Section  6  outlines  methods  to  be  used  if  and  when  data  on  light  and  on  severe 
accidents  are  available.  The  study  of  severe  accidents  differs  from  that  of  light 
accidents  by  the  fact  that  severe  accidents  are  frequently  not  survived  by  the 
victims.  Consequently,  even  if  the  model  treated  in  this  paper  is  strictly  applicable 
to  light  and  severe  accidents,  because  of  the  distortions  caused  by  fatal  accidents, 
the  joint  distribution  of  the  numbers  of  light  and  severe  accidents  will  not  be  the 
bivariate  negative  binomial.  Therefore,  any  empirical  study  relating  to  light  and 
severe  accidents  will  require  an  appropriate  distribution.  Such  a  distribution,  based 
on  the  assumption  that  the  probability  of  surviving  an  accident  is  constant,  is  given 
in  section  6. 

Throughout the paper the notation adopted is that of J. Neyman’s recent book [7].

2. Multivariate distribution of the numbers of accidents. The subject of this
section is the joint distribution of an arbitrary number $s$ of random variables
$X_1, X_2, \ldots, X_s$, where $X_i$ represents the number of accidents of the $i$th kind, incurred
by an individual randomly drawn from a population.

The method used is that of probability generating functions, introduced by
Laplace. A modern presentation of the theory is given by Feller [6]. The probability
generating function is defined for sets of random variables all capable of assuming
only nonnegative integer values. It will be denoted by $G$ with subscripts indicating
the random variables to which it refers. The arguments of $G$ will always be assumed
not to exceed unity in absolute value so as to insure the convergence of the series
representing $G$. When dealing with conditional distributions, the hypotheses on
which these distributions are based will be symbolized to the right of the vertical
bar that follows the arguments of the probability generating function. Thus the
conditional probability generating function of the random variables $X_1, X_2, \ldots, X_s$,
given a hypothesis $H$ will be denoted and defined as

$$G_{X_1, X_2, \ldots, X_s}(u_1, u_2, \ldots, u_s \mid H) = E\Big[\prod_{i=1}^{s} u_i^{X_i} \,\Big|\, H\Big] = \sum_{n_1, n_2, \ldots, n_s} P\{(X_1 = n_1)(X_2 = n_2) \cdots (X_s = n_s) \mid H\} \prod_{i=1}^{s} u_i^{n_i}, \tag{3}$$

where the summation extends over all nonnegative values of each $n_i = 0, 1, 2, \ldots$,
for $i = 1, 2, \ldots, s$.

In  the  following,  we  shall  use  several  properties  of  probability  generating  functions 
which  are  direct  consequences  of  the  above  definition. 

Generalizing the conditions of the problem studied by Greenwood, Yule, and
Newbold, we assume that to the population studied and the $s$ different kinds of
accidents there correspond $s$ positive numbers $a_1 = 1, a_2, a_3, \ldots, a_s$. Thus, these
numbers are the same for each individual of the population. We assume further that
to every individual of the population there corresponds a positive number $\lambda$,
measuring his particular proneness to accidents. For an individual to be randomly
drawn from the population, this number is interpreted as a particular value of a
random variable $\Lambda$. The distribution of $\Lambda$ will be denoted by $F(\lambda) = P\{\Lambda \le \lambda\}$.
Some of the results obtained are independent of any assumption regarding $F(\lambda)$
except that $F(0) = 0$ so that $\Lambda$ is necessarily a positive random variable. However,
most of the results are based on the assumption that the distribution function of $\Lambda$
has the particular form postulated by Greenwood, Yule, and Newbold, representing
the integral of the probability density (2).

Given a particular individual of the population, that is, given a fixed value of $\lambda$,
we shall assume that the numbers of accidents $X_1, X_2, \ldots, X_s$ are mutually
independent and that each follows a Poisson law with the expectation of $X_i$ equal to $a_i\lambda$,
$i = 1, 2, \ldots, s$. It follows that, given $\lambda$, the conditional joint probability generating
function of $X_1, X_2, \ldots, X_s$ is

$$G_{X_1, X_2, \ldots, X_s}(u_1, u_2, \ldots, u_s \mid \lambda) = \exp\Big\{\lambda \sum_{i=1}^{s} a_i (u_i - 1)\Big\}. \tag{4}$$


Replacing in (4) $\lambda$ by the random variable $\Lambda$ and taking the expectation with
respect to the distribution of this variable, we obtain the absolute probability
generating function,

$$G_{X_1, X_2, \ldots, X_s}(u_1, u_2, \ldots, u_s) = E\big[G_{X_1, X_2, \ldots, X_s}(u_1, u_2, \ldots, u_s \mid \Lambda)\big] = \phi\Big[\sum_{i=1}^{s} a_i (u_i - 1)\Big], \tag{5}$$

where $\phi(t)$ is the Laplace transform of the distribution $F(\lambda)$,

$$\phi(t) = \int_0^\infty e^{t\lambda}\, dF(\lambda). \tag{6}$$

It will be seen that for $t < 0$ the function $\phi(t)$ is indefinitely differentiable.

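The Laplace transform of the distribution defined by (2) is, say

$$\phi^*(t) = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty \lambda^{\alpha-1} e^{-(\beta - t)\lambda}\, d\lambda = \Big[1 - \frac{t}{\beta}\Big]^{-\alpha}. \tag{7}$$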

Thus, on the assumption that (2) represents the probability density of $\Lambda$, the joint
probability generating function of $X_1, X_2, \ldots, X_s$ is, say

$$G^*_{X_1, X_2, \ldots, X_s}(u_1, u_2, \ldots, u_s) = \Big[1 + \sum_{i=1}^{s} b_i (1 - u_i)\Big]^{-\alpha}, \tag{8}$$

where, for the sake of simplicity in formulae, $b_i = a_i/\beta$, $i = 1, 2, \ldots, s$.

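Owing to the particular form of the probability generating function (8), the
corresponding distribution of $X_1, X_2, \ldots, X_s$ will be called the $s$-variate negative
binomial distribution. Easy expansion of (8) in powers of $u_1, u_2, \ldots, u_s$ gives

$$P\{(X_1 = n_1)(X_2 = n_2) \cdots (X_s = n_s)\} = \Big[1 + \sum_{i=1}^{s} b_i\Big]^{-(\alpha + n)} \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)} \prod_{i=1}^{s} \frac{b_i^{n_i}}{n_i!}, \tag{9}$$

where $n = n_1 + n_2 + \cdots + n_s$ and

$$1 + \sum_{j=1}^{s} b_j = \frac{\beta + \sum_{j=1}^{s} a_j}{\beta}. \tag{10}$$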
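
As a numerical illustration of how the probabilities (9) arise from the generating function (8), the Python sketch below evaluates both and compares the truncated series of definition (3) with the closed form. The function names, the truncation point, and the example constants are assumptions made for this illustration; the probabilities are written as the expansion of $[1 + \sum b_i(1 - u_i)]^{-\alpha}$ with $b_i = a_i/\beta$.

```python
import math
from itertools import product

def mnb_pmf(counts, a, alpha, beta):
    """Probabilities (9) of the s-variate negative binomial, with b_i = a_i / beta."""
    b = [a_i / beta for a_i in a]
    n = sum(counts)
    log_p = (-(alpha + n) * math.log(1.0 + sum(b))
             + math.lgamma(alpha + n) - math.lgamma(alpha))
    for n_i, b_i in zip(counts, b):
        log_p += n_i * math.log(b_i) - math.lgamma(n_i + 1)
    return math.exp(log_p)

def mnb_pgf(u, a, alpha, beta):
    """Generating function (8): [1 + sum b_i (1 - u_i)]^(-alpha)."""
    return (1.0 + sum(a_i / beta * (1.0 - u_i) for a_i, u_i in zip(a, u))) ** (-alpha)

def pgf_from_pmf(u, a, alpha, beta, cutoff=150):
    """Definition (3) evaluated by summing over 0..cutoff-1 in each coordinate (truncated)."""
    total = 0.0
    for counts in product(range(cutoff), repeat=len(a)):
        weight = 1.0
        for u_i, n_i in zip(u, counts):
            weight *= u_i ** n_i
        total += mnb_pmf(counts, a, alpha, beta) * weight
    return total

if __name__ == "__main__":
    # Illustrative constants: two kinds of accidents with a = (1, 3), alpha = 2, beta = 1.5.
    a, alpha, beta = (1.0, 3.0), 2.0, 1.5
    u = (0.3, 0.6)
    print(pgf_from_pmf(u, a, alpha, beta), mnb_pgf(u, a, alpha, beta))
```

Setting every $u_i = 1$ in `pgf_from_pmf` gives the total probability, which should be very close to unity when the truncation point is large enough.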

The  distributions  defined  by  (5)  and  (8)  possess  the  following  remarkable  properties. 

Let $r_1, r_2, \ldots, r_s$ be any permutation of numbers $1, 2, \ldots, s$ and let $m$ be any
positive integer less than $s$.

Theorem 1. If the random variables $X_1, X_2, \ldots, X_s$ follow the multivariate negative
binomial distribution (8) then the joint distribution of $X_{r_1}, X_{r_2}, \ldots, X_{r_m}$ is also
negative binomial.

The probability generating function of the marginal distribution of $X_{r_1}, X_{r_2}, \ldots, X_{r_m}$
is obtained from (8) by substituting $u_{r_{m+1}} = u_{r_{m+2}} = \cdots = u_{r_s} = 1$. It is easily
seen that the result of this substitution is a function of the same type with the sum
of $m$ terms

$$\sum_{i=1}^{m} b_{r_i} (1 - u_{r_i}) \tag{11}$$

replacing in the square brackets the sum of $s$ similar terms and the theorem is proved.
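
Theorem 1 can also be checked directly on the probabilities: summing a trivariate negative binomial law over its third variable should reproduce the bivariate law with the same exponent and the first two constants $b_1, b_2$. The following Python sketch does this by truncated summation. It is an illustration under assumed names and arbitrary example values, and it writes the probabilities in terms of $b_i$ as the expansion of $[1 + \sum b_i(1 - u_i)]^{-\alpha}$.

```python
import math

def mnb_pmf(counts, b, alpha):
    """Multivariate negative binomial probabilities written in terms of the constants b_i."""
    n = sum(counts)
    log_p = (-(alpha + n) * math.log(1.0 + sum(b))
             + math.lgamma(alpha + n) - math.lgamma(alpha))
    for n_i, b_i in zip(counts, b):
        log_p += n_i * math.log(b_i) - math.lgamma(n_i + 1)
    return math.exp(log_p)

def marginal_by_summation(n1, n2, b, alpha, cutoff=400):
    """Marginal P{X_1 = n1, X_2 = n2} of a trivariate law, summing X_3 over 0..cutoff-1."""
    return sum(mnb_pmf((n1, n2, n3), b, alpha) for n3 in range(cutoff))

if __name__ == "__main__":
    b, alpha = (0.5, 1.2, 2.0), 1.7   # arbitrary illustrative values
    for n1, n2 in [(0, 0), (1, 2), (3, 1)]:
        summed = marginal_by_summation(n1, n2, b, alpha)
        direct = mnb_pmf((n1, n2), b[:2], alpha)
        print((n1, n2), summed, direct)
```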

Theorem 2. Whatever the distribution $F(\lambda)$ of $\Lambda$, given that the sum, say

$$\mathcal{X} = \sum_{i=1}^{s} X_i, \tag{12}$$

has assumed a value $n$, the conditional joint distribution of $X_1, X_2, \ldots, X_{s-1}$ is the
multinomial distribution with the probability generating function

$$\Big[\sum_{i=1}^{s-1} d_i u_i + d_s\Big]^{n}, \tag{13}$$

with

$$d_i = \frac{a_i}{\sum_{j=1}^{s} a_j}, \qquad i = 1, 2, \ldots, s. \tag{14}$$

Starting with the definition, the generating function

$$G_{X_1, \ldots, X_{s-1}, \mathcal{X}}(u_1, \ldots, u_{s-1}, v) = E\Big[v^{\mathcal{X}} \prod_{i=1}^{s-1} u_i^{X_i}\Big] = E\Big[v^{X_s} \prod_{i=1}^{s-1} (u_i v)^{X_i}\Big] = G_{X_1, X_2, \ldots, X_{s-1}, X_s}(u_1 v, u_2 v, \ldots, u_{s-1} v, v), \tag{15}$$

and, therefore, because of (5)

$$G_{X_1, \ldots, X_{s-1}, \mathcal{X}}(u_1, \ldots, u_{s-1}, v) = \phi\Big[v \Big(\sum_{i=1}^{s-1} a_i u_i + a_s\Big) - \sum_{i=1}^{s} a_i\Big]. \tag{16}$$

In order to obtain the probability generating function (13) it is sufficient to
expand (16) in powers of $v$, to select the coefficient of $v^n$ and to divide this coefficient
by its value corresponding to $u_1 = u_2 = \cdots = u_{s-1} = 1$. It is easy to see that the
result of this operation coincides with (13).
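
Since Theorem 2 holds whatever the distribution of the proneness, it can be illustrated with a distribution that is deliberately not of Pearson type III. The Python sketch below uses a hypothetical three-point distribution of $\Lambda$ and compares the conditional probabilities, given the total number of accidents, with the multinomial probabilities whose cell probabilities are $d_i = a_i / \sum_j a_j$. All names and numerical values are assumptions chosen for the example; the sketch also uses the standard fact that a sum of independent Poisson variables is again Poisson.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k with the given positive mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def joint_pmf(counts, a, lambdas, weights):
    """P{X_1 = n_1, ..., X_s = n_s} when Lambda takes the values `lambdas` with probabilities `weights`."""
    total = 0.0
    for lam, w in zip(lambdas, weights):
        term = w
        for n_i, a_i in zip(counts, a):
            term *= poisson_pmf(n_i, a_i * lam)
        total += term
    return total

def conditional_given_total(counts, a, lambdas, weights):
    """P{X_1 = n_1, ..., X_s = n_s | total = n}; given lambda the total is Poisson(lambda * sum(a))."""
    n = sum(counts)
    p_total = sum(w * poisson_pmf(n, lam * sum(a)) for lam, w in zip(lambdas, weights))
    return joint_pmf(counts, a, lambdas, weights) / p_total

def multinomial_pmf(counts, a):
    """Multinomial probabilities with cell probabilities d_i = a_i / sum(a)."""
    n = sum(counts)
    log_p = math.lgamma(n + 1)
    for n_i, a_i in zip(counts, a):
        log_p += n_i * math.log(a_i / sum(a)) - math.lgamma(n_i + 1)
    return math.exp(log_p)

if __name__ == "__main__":
    a = (1.0, 2.5, 0.7)                                  # illustrative constants a_i
    lambdas, weights = (0.4, 1.3, 5.0), (0.5, 0.3, 0.2)  # hypothetical three-point law of Lambda
    for counts in [(2, 1, 0), (0, 3, 1), (1, 1, 1)]:
        print(counts, conditional_given_total(counts, a, lambdas, weights), multinomial_pmf(counts, a))
```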

Theorem 3. Whatever be the distribution function $F(\lambda)$ of $\Lambda$, given that $X_{r_i} = n_{r_i}$,
$i = m + 1, m + 2, \ldots, s$, the conditional distribution of $X_{r_j}$, for $j = 1, 2, \ldots, m$,
depends only on the sum

$$n = \sum_{i=m+1}^{s} n_{r_i}, \tag{17}$$

not on the numbers $n_{r_i}$ taken separately.

This theorem describes a very important property of the joint distribution of
accidents satisfying assumption (i). Owing to this property, the problem of predicting
the number of severe accidents, say, the number $X_1$ of accidents of the first kind,
using the numbers, e.g., $X_{m+1}, X_{m+2}, \ldots, X_s$, of accidents of some $s - m$ other
kinds reduces to that of predicting $X_1$ using the value of the sum, say

$$Y = \sum_{i=m+1}^{s} X_i. \tag{18}$$

Thus, whatever be the relative frequency of the predictor accidents as measured by
the constants $a_{m+1}, a_{m+2}, \ldots, a_s$, in order to predict the value of $X_1$ no weighting of
the numbers of these accidents is necessary, and this irrespective of the actual form
of the distribution of $\Lambda$.

Obviously, it will be sufficient to prove theorem 3 for $r_i = i$, $i = 1, 2, \ldots, s$. By
examining the definition of the probability generating function it is easy to see that
the conditional probability generating function of $X_1, X_2, \ldots, X_m$ given that the
other variables $X_{m+1}, X_{m+2}, \ldots, X_s$ have assumed some specified values
$n_{m+1}, n_{m+2}, \ldots, n_s$, respectively, is obtained from (5) as a result of the two following operations.

a) Expand (5) in powers of $u_{m+1}, u_{m+2}, \ldots, u_s$ and obtain the coefficient $C$ of the
product

$$\prod_{i=m+1}^{s} u_i^{n_i}. \tag{19}$$

Obviously, $C$ is a function of $u_1, u_2, \ldots, u_m$.

b) Divide $C$ by the value of this coefficient corresponding to $u_1 = u_2 = \cdots = u_m = 1$.

Performing these operations on (5), we obtain

$$C = \prod_{i=m+1}^{s} \frac{a_i^{n_i}}{n_i!}\; \phi^{(n)}(t), \tag{20}$$

where $\phi^{(n)}(t)$ denotes the $n$th derivative of $\phi$ with respect to $t$ and where

$$t = \sum_{i=1}^{m} a_i (u_i - 1) - \sum_{j=m+1}^{s} a_j = \sum_{i=1}^{m} a_i (u_i - 1) + T, \text{ say}, \tag{21}$$

so that $T = -\sum_{j=m+1}^{s} a_j$ is the value of $t$ at $u_1 = u_2 = \cdots = u_m = 1$.

It follows that

$$G_{X_1, X_2, \ldots, X_m}\big[u_1, \ldots, u_m \mid (X_{m+1} = n_{m+1}) \cdots (X_s = n_s)\big] = \frac{\phi^{(n)}(t)}{\phi^{(n)}(T)}. \tag{22}$$

It is seen that the right-hand side depends on the sum $n$ of the values assumed by
the variables $X_{m+1}, X_{m+2}, \ldots, X_s$ but not on these values taken separately, which
proves theorem 3.
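
The content of theorem 3 can be illustrated numerically for a distribution of $\Lambda$ chosen at will. In the Python sketch below, a hypothetical three-point distribution of the proneness is used with $s = 3$ kinds of accidents, and the conditional probabilities of $X_1$ are computed for several predictor records having the same total. The names and numerical values are assumptions made for this example only.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k with the given positive mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def conditional_pmf_x1(k, predictor_counts, a, lambdas, weights):
    """P{X_1 = k | X_2 = n_2, ..., X_s = n_s} for a discrete distribution of Lambda."""
    numerator = 0.0
    denominator = 0.0
    for lam, w in zip(lambdas, weights):
        # Probability of the observed predictor record for this value of lambda.
        p_predictors = w
        for n_i, a_i in zip(predictor_counts, a[1:]):
            p_predictors *= poisson_pmf(n_i, a_i * lam)
        denominator += p_predictors
        numerator += p_predictors * poisson_pmf(k, a[0] * lam)
    return numerator / denominator

if __name__ == "__main__":
    a = (1.0, 2.0, 0.5)                                   # illustrative constants a_1, a_2, a_3
    lambdas, weights = (0.5, 2.0, 4.0), (0.6, 0.3, 0.1)   # hypothetical law of Lambda
    for record in [(3, 0), (2, 1), (1, 2), (0, 3)]:       # all have total n = 3
        print(record, [conditional_pmf_x1(k, record, a, lambdas, weights) for k in range(4)])
```

All four records give the same conditional probabilities, whereas a record with a different total, such as (4, 0), gives different ones.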

Theorem 4. If the variables $X_1, X_2, \ldots, X_s$ follow the multivariate negative
binomial distribution (8) then, given $X_{r_i} = n_{r_i}$ for $i = m + 1, m + 2, \ldots, s$, the
conditional distribution of $X_{r_1}, X_{r_2}, \ldots, X_{r_m}$ is also a negative binomial distribution
depending on the sum $n$ defined by (17).

It will be sufficient to prove theorem 4 assuming $r_i = i$ for $i = 1, 2, \ldots, s$. The
proof may either be direct, starting from (8) or take into account (22) and evaluate
the $n$th derivative of (7). We have

$$\frac{d^n \phi^*}{dt^n} = \frac{1}{\beta^n} \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)} \Big[1 - \frac{t}{\beta}\Big]^{-(\alpha + n)} \tag{23}$$

and it follows that

$$G^*_{X_1, X_2, \ldots, X_m}\big[u_1, \ldots, u_m \mid (X_{m+1} = n_{m+1}) \cdots (X_s = n_s)\big] = \Big[1 + \sum_{i=1}^{m} e_i (1 - u_i)\Big]^{-(\alpha + n)} \tag{24}$$

with

$$e_i = \frac{a_i}{\beta + \sum_{j=m+1}^{s} a_j}, \tag{25}$$

which proves the theorem.
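
For the simplest case of one predicted variable and a single predictor total, theorem 4 can be verified numerically. The Python sketch below treats the pair $(X_1, Y)$ as bivariate negative binomial with constants $a_1$ and $A$ (the sum of the predictor constants), computes the conditional probabilities of $X_1$ given $Y = n$ by direct division, and compares them with the negative binomial law having exponent $\alpha + n$ and constant $e_1 = a_1/(\beta + A)$. Function names, the truncation point, and the example values are assumptions of this illustration.

```python
import math

def mnb_pmf(counts, a, alpha, beta):
    """Multivariate negative binomial probabilities with b_i = a_i / beta."""
    b = [a_i / beta for a_i in a]
    n = sum(counts)
    log_p = (-(alpha + n) * math.log(1.0 + sum(b))
             + math.lgamma(alpha + n) - math.lgamma(alpha))
    for n_i, b_i in zip(counts, b):
        log_p += n_i * math.log(b_i) - math.lgamma(n_i + 1)
    return math.exp(log_p)

def conditional_pmf(k, n, a1, A, alpha, beta, cutoff=400):
    """P{X_1 = k | Y = n}: joint probability divided by the (truncated) marginal of Y."""
    joint = mnb_pmf((k, n), (a1, A), alpha, beta)
    marginal = sum(mnb_pmf((j, n), (a1, A), alpha, beta) for j in range(cutoff))
    return joint / marginal

def negative_binomial_pmf(k, exponent, e):
    """Probabilities generated by [1 + e (1 - u)]^(-exponent)."""
    return math.exp(math.lgamma(exponent + k) - math.lgamma(exponent) - math.lgamma(k + 1)
                    + k * math.log(e) - (exponent + k) * math.log(1.0 + e))

if __name__ == "__main__":
    a1, A, alpha, beta, n = 1.0, 3.0, 2.0, 1.5, 4   # illustrative values
    e1 = a1 / (beta + A)
    for k in range(6):
        print(k, conditional_pmf(k, n, a1, A, alpha, beta), negative_binomial_pmf(k, alpha + n, e1))
```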

As a result of theorem 3, the conditional distribution of $X_1, X_2, \ldots, X_m$, given
$X_{m+1} = n_{m+1}, \ldots, X_s = n_s$, will be identified with the conditional distribution of
the same variables, given that the sum $Y$ defined in (18) has assumed the value $n$
of (17). In particular, the multiple correlation coefficient of $X_1$ and $X_{m+1}, X_{m+2}, \ldots, X_s$,
say $\rho$, coincides with the ordinary correlation coefficient of $X_1$ and $Y$ as defined
in (18). In order to study the regression of $X_1$ on $X_{m+1}, X_{m+2}, \ldots, X_s$ or the multiple
correlation $\rho$, it will be sufficient to consider the probability generating function of
$X_1$ and $Y$ obtainable either from (5) or from (8) by substituting $u_1 = u$,
$u_2 = u_3 = \cdots = u_m = 1$ and $u_{m+1} = u_{m+2} = \cdots = u_s = v$. Thus formula (5) gives

$$G_{X_1, Y}(u, v) = \phi\big[a_1 (u - 1) + A (v - 1)\big] \tag{26}$$

where, for short,

$$A = \sum_{i=m+1}^{s} a_i. \tag{27}$$

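Theorem 5. Whatever the distribution function $F(\lambda)$ of $\Lambda$, provided it possesses two
first moments, the square $\rho^2$ of the correlation coefficient between $X_1$ and $Y$ is given by

$$\rho^2 = \Big[1 - \frac{E(X_1)}{\sigma_{X_1}^2}\Big] \Big[1 - \frac{E(Y)}{\sigma_Y^2}\Big] = \frac{a_1 A\, \sigma_\Lambda^4}{(a_1 \sigma_\Lambda^2 + \mu_1)(A \sigma_\Lambda^2 + \mu_1)}, \tag{28}$$

where $\mu_1$ is the expectation of $\Lambda$ and $\sigma_\Lambda^2$ its variance.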

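In order to deduce formula (28) we use the familiar relations between the moments
of the random variables and the derivatives of their probability generating function
evaluated at the values of arguments equal to unity. In particular

$$E(X_1) = \frac{\partial G_{X_1, Y}}{\partial u}\Big|_{u=v=1} = a_1 \phi'(0) = a_1 \mu_1, \tag{29}$$

$$E(X_1^2) - E(X_1) = \frac{\partial^2 G_{X_1, Y}}{\partial u^2}\Big|_{u=v=1} = a_1^2 \phi''(0) = a_1^2 E(\Lambda^2), \tag{30}$$

and it follows

$$\sigma_{X_1}^2 = a_1^2 \sigma_\Lambda^2 + a_1 \mu_1. \tag{31}$$

Also,

$$E(Y) = A \mu_1, \qquad \sigma_Y^2 = A^2 \sigma_\Lambda^2 + A \mu_1, \tag{32}$$

and

$$E(X_1 Y) = \frac{\partial^2 G_{X_1, Y}}{\partial u\, \partial v}\Big|_{u=v=1} = a_1 A\, E(\Lambda^2). \tag{33}$$

Finally, we get

$$\rho^2 = \frac{\big[E(X_1 Y) - E(X_1) E(Y)\big]^2}{\sigma_{X_1}^2 \sigma_Y^2} = \frac{a_1 A\, \sigma_\Lambda^4}{(a_1 \sigma_\Lambda^2 + \mu_1)(A \sigma_\Lambda^2 + \mu_1)}, \tag{34}$$

which coincides with the second part of (28). In order to obtain the first part of this
formula, we notice that

$$\frac{a_1 \sigma_\Lambda^2}{a_1 \sigma_\Lambda^2 + \mu_1} = 1 - \frac{E(X_1)}{\sigma_{X_1}^2} \tag{35}$$

and a similar relation for $Y$.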
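
The agreement of the two expressions for $\rho^2$ can be confirmed with a few lines of arithmetic. The Python sketch below assumes the two forms to be, respectively, the product $[1 - E(X_1)/\sigma_{X_1}^2][1 - E(Y)/\sigma_Y^2]$ and the ratio $a_1 A \sigma_\Lambda^4 / [(a_1\sigma_\Lambda^2 + \mu_1)(A\sigma_\Lambda^2 + \mu_1)]$, with the moments of the counts computed as $c\,\mu_1$ and $c^2\sigma_\Lambda^2 + c\,\mu_1$ for a count whose conditional expectation is $c\Lambda$. The function names and the numerical values are assumptions of this illustration.

```python
def count_moments(multiplier, mean_lambda, var_lambda):
    """Mean and variance of a count whose conditional expectation is multiplier * Lambda."""
    mean = multiplier * mean_lambda
    variance = multiplier ** 2 * var_lambda + multiplier * mean_lambda
    return mean, variance

def rho_squared_from_moments(mean_x1, var_x1, mean_y, var_y):
    """Product form: [1 - E(X1)/var(X1)] [1 - E(Y)/var(Y)]."""
    return (1.0 - mean_x1 / var_x1) * (1.0 - mean_y / var_y)

def rho_squared_from_lambda(a1, A, mean_lambda, var_lambda):
    """Ratio form: a1 A sigma^4 / ((a1 sigma^2 + mu1)(A sigma^2 + mu1))."""
    return (a1 * A * var_lambda ** 2
            / ((a1 * var_lambda + mean_lambda) * (A * var_lambda + mean_lambda)))

if __name__ == "__main__":
    # Illustrative case: Pearson type III proneness with alpha = 2, beta = 1.5,
    # so that mu1 = alpha / beta and sigma^2 = alpha / beta^2; a1 = 1, A = 3.
    alpha, beta, a1, A = 2.0, 1.5, 1.0, 3.0
    mu1, var_lam = alpha / beta, alpha / beta ** 2
    mean_x1, var_x1 = count_moments(a1, mu1, var_lam)
    mean_y, var_y = count_moments(A, mu1, var_lam)
    print(rho_squared_from_moments(mean_x1, var_x1, mean_y, var_y),
          rho_squared_from_lambda(a1, A, mu1, var_lam))
```

For the Pearson type III law both forms reduce to $a_1 A / [(a_1 + \beta)(A + \beta)]$, which does not depend on $\alpha$.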

Theorem 5 implies important conclusions regarding the possibility of predicting
the value $X_1$ by using the values assumed by $X_2, X_3, \ldots, X_s$.

Corollary 1. If $\rho$ is taken as a conventional measure of precision in predicting the
value of $X_1$ from the observed values of $X_{m+1}, X_{m+2}, \ldots, X_s$, then, whatever be $F(\lambda)$, it
is advantageous to use as many predictors as possible, that is, it is advantageous to
take $m = 1$.

This conclusion is the immediate result of the fact that $\rho$ is an increasing function
of $A$ as defined in (27).

Corollary  2.  Whatever  the  distribution  function  F(\),  and  whatever  the  number  of 
predictor  variables  Xm+i ,  Xm+2,  *  •  *,  Xs,  the  correlation  p  must  be  smaller  than  the 
upper  bound 


(36) 


< 


E(Xi) 


depending only on the expectation and on the variance of the predicted variable $X_1$.

Formula (36) is an immediate consequence of the first part of (28). The practical
conclusion is that, before attempting to use the numbers of any accidents in order
to predict the value of $X_1$, one should estimate the expectation of $X_1$ and its variance.
If the right-hand side of (36) is close to zero then the prospects of attaining a good
prediction, at least by means of a linear regression equation, are slim.
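
As a numerical illustration of Corollaries 1 and 2, the following Python sketch evaluates the correlation between $X_1$ and Y, and the bound (36), from the mean and variance of Λ alone. The moment expressions are those of a mixture of Poisson variables with means $a_1\Lambda$ and $A\Lambda$; the function names and the illustrative values (mean 1.5 and variance 0.75 for Λ, $a_1 = 1$) are assumptions made here and are not part of the original text.

```python
import math

def correlation(a1, A, mean_lam, var_lam):
    """Correlation of X1 and Y when, given Lambda, they are independent
    Poisson variables with means a1*Lambda and A*Lambda."""
    cov = a1 * A * var_lam
    var_x = a1 * a1 * var_lam + a1 * mean_lam
    var_y = A * A * var_lam + A * mean_lam
    return cov / math.sqrt(var_x * var_y)

def upper_bound(a1, mean_lam, var_lam):
    """Bound on rho that depends only on E(X1) and Var(X1)."""
    mean_x = a1 * mean_lam
    var_x = a1 * a1 * var_lam + a1 * mean_lam
    return math.sqrt((var_x - mean_x) / var_x)

# Hypothetical illustration: E(Lambda) = 1.5, Var(Lambda) = 0.75, a1 = 1.
MEAN_LAM, VAR_LAM = 1.5, 0.75
A_VALUES = (1.0, 5.0, 20.0, 100.0)
rhos = [correlation(1.0, A, MEAN_LAM, VAR_LAM) for A in A_VALUES]
bound = upper_bound(1.0, MEAN_LAM, VAR_LAM)

for A, rho in zip(A_VALUES, rhos):
    print(f"A = {A:6.1f}   rho = {rho:.4f}   bound = {bound:.4f}")
```

In this illustration the correlation grows with A but stays below the bound, which is what the two corollaries assert.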

Theorem 6. If the random variables $X_1, X_2, \ldots, X_s$ jointly follow a multivariate
negative binomial distribution, then the regression of $X_1$ on the sum Y as defined by
(18), is linear, and namely

(37)  $E(X_1 \mid Y = n) = \frac{a_1(\alpha + n)}{\beta + \sum a_j}$ .

Under the hypotheses of the theorem, the conditional probability generating
function of $X_1$, given Y = n, is obtained from (24) by substituting $u_2 = u_3 = \cdots = u_m = 1$
and we have

(38)  $G_{X_1}(u_1 \mid Y = n) = [1 + \theta_1(1 - u_1)]^{-(\alpha + n)}$ .

The regression function of $X_1$ on Y is obtained by differentiating (38) and by setting
$u_1 = 1$. The result is (37).
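
The last step of the proof can be checked numerically. The sketch below differentiates the conditional generating function (38) by a central difference at $u_1 = 1$ and compares the result with (37). It assumes that the constant in (38) is $\theta_1 = a_1/(\beta + \sum a_j)$, the value for which (38) and (37) agree; the function names and the parameter values are purely illustrative.

```python
def conditional_pgf(u1, n, alpha, theta1):
    """Formula (38): generating function of X1 given Y = n."""
    return (1.0 + theta1 * (1.0 - u1)) ** (-(alpha + n))

def regression_from_pgf(n, alpha, theta1, h=1e-6):
    """E(X1 | Y = n) as the derivative of (38) at u1 = 1 (central difference)."""
    up = conditional_pgf(1.0 + h, n, alpha, theta1)
    down = conditional_pgf(1.0 - h, n, alpha, theta1)
    return (up - down) / (2.0 * h)

def regression_formula(n, a1, alpha, beta, sum_a):
    """Formula (37)."""
    return a1 * (alpha + n) / (beta + sum_a)

# Illustrative (hypothetical) values: a1 = 1, alpha = 3, beta = 2, sum of a_j = 4.
a1, alpha, beta, sum_a = 1.0, 3.0, 2.0, 4.0
theta1 = a1 / (beta + sum_a)   # assumed reading of the constant in (38)
for n in range(0, 10, 3):
    print(n, regression_from_pgf(n, alpha, theta1),
          regression_formula(n, a1, alpha, beta, sum_a))
```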

Theorems 1, 4, and 6 describe interesting properties of the multivariate negative
binomial distribution whereby it is somewhat similar to the multivariate normal.
Naturally, however, the analogy is far from complete. Thus, for example, the conditional
variance of $X_1$ given Y, or given any single variable $X_i$, i ≠ 1, is not constant
but increases linearly with the value of the fixed variable. Furthermore, the sum of
two independent negative binomial variables may but need not be a negative binomial,
etc.

3. Estimation of parameters in the bivariate negative binomial distribution. In
section 2 it was shown that, when the model considered applies, the s-dimensional
problem may be reduced to a two-dimensional problem. In particular, if formula (2)
adequately represents the probability density function of Λ, then, in order to treat
the problem of predicting the number, say X = x, of severe accidents using any
number of categories of light accidents, it is sufficient to study a bivariate negative
binomial distribution of X and Y, where Y stands for the total number of light
accidents embracing all the s − 1 different categories originally considered. Remembering
the convention $a_1 = 1$, the joint probability generating function of X
and Y may be written as

(39)  $G_{X,Y}(u, v) = \beta^{\alpha}[\beta + (1 - u) + A(1 - v)]^{-\alpha}$

with $A = a_2 + a_3 + \cdots + a_s$. By expanding (39) in powers of u and v we obtain
as the coefficient of $u^k v^m$

(40)  $P_{X,Y}(k, m) = \frac{\Gamma(\alpha + m + k)}{k!\,m!\,\Gamma(\alpha)}\,\beta^{\alpha} A^{m} (\beta + A + 1)^{-(\alpha + m + k)}$ .

We shall suppose that n independent observations on the pair (X, Y) will be made.
The letter $n_{k,m}$ will then denote the random variable representing the number of pairs
[(X = k), (Y = m)]. The joint frequency function of all the $n_{k,m}$ is represented by the
product, say

(41)  $J = C\prod_{k=0}^{\infty}\prod_{m=0}^{\infty}[P_{X,Y}(k, m)]^{n_{k,m}}$ ,

where C stands for a factor depending on the $n_{k,m}$ but not on the parameters α, β, and
A, and where

(42)  $\sum_{k=0}^{\infty}\sum_{m=0}^{\infty} n_{k,m} = n$ .
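
Formula (40) is easy to evaluate numerically. The following Python sketch computes it through logarithms of gamma functions (to avoid overflow for large k + m) and checks two elementary properties: the probabilities add up to unity, and the value at k = m = 0 equals $(\beta/(\beta + A + 1))^{\alpha}$. The function name and the parameter values α = 3, β = 2, A = 4 are illustrative assumptions.

```python
import math

def bivariate_nb_pmf(k, m, alpha, beta, A):
    """P{X = k, Y = m} according to formula (40)."""
    log_p = (math.lgamma(alpha + k + m) - math.lgamma(alpha)
             - math.lgamma(k + 1) - math.lgamma(m + 1)
             + alpha * math.log(beta) + m * math.log(A)
             - (alpha + k + m) * math.log(beta + A + 1.0))
    return math.exp(log_p)

# Illustrative (hypothetical) parameter values.
ALPHA, BETA, A_MOD = 3.0, 2.0, 4.0

# Sum over a grid large enough for the neglected tail to be negligible here.
total = sum(bivariate_nb_pmf(k, m, ALPHA, BETA, A_MOD)
            for k in range(80) for m in range(400))
p00 = bivariate_nb_pmf(0, 0, ALPHA, BETA, A_MOD)

print(f"sum of probabilities = {total:.12f}")
print(f"P(0, 0) = {p00:.6f}")
```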

Our problem is to deduce formulae for the maximum likelihood estimates, say $\alpha_n$,
$\beta_n$ and $A_n$, of these three parameters. Recent results [8] imply that these estimates
possess the following properties: (i) the estimates are functions of the relative frequencies
$n_{k,m}/n$ but do not depend otherwise on n; (ii) the estimates possess continuous
partial derivatives with respect to each relative frequency; (iii) as n → ∞, the
estimates are consistent and asymptotically normal about the true values of the particular
parameters; (iv) the asymptotic variances of the estimates $\alpha_n$, $\beta_n$, and $A_n$ decrease
as $n^{-1}$ and do not exceed the asymptotic variances of any other estimates
possessing the properties (i), (ii) and (iii).²

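Substituting (40) into (41), taking logarithms and dividing by n, we obtain

(43)  $\frac{1}{n}\log J = C_1 + \alpha\log\beta - (\alpha + \bar X + \bar Y)\log(\beta + A + 1) + \bar Y\log A + \sum_{t=0}^{\infty}\Bigl(1 - \sum_{r=0}^{t} q_r\Bigr)\log(\alpha + t)$

where $C_1$ represents a term independent of the parameters and where

(44)  $\bar X = \frac{1}{n}\sum_{k=0}^{\infty}\sum_{m=0}^{\infty} k\,n_{k,m}$ ,  $\bar Y = \frac{1}{n}\sum_{k=0}^{\infty}\sum_{m=0}^{\infty} m\,n_{k,m}$ ,  $q_r = \frac{1}{n}\sum_{k+m=r} n_{k,m}$ .
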

² Until recently it was believed that the asymptotic variances of the maximum likelihood
estimates cannot exceed those of any other consistent and asymptotically normal estimates. A
conjecture to this effect is usually ascribed to R. A. Fisher, who, since 1921 [10], has repeatedly
proclaimed the above statement as a property of maximum likelihood estimates. In this connection,
see also F. Y. Edgeworth who enunciated in his paper [9] of 1908 essentially the same conjecture
(with a vague restriction on the nature of the estimate). Although the proofs of both Edgeworth
and Fisher obviously lack precision, this conjecture was generally taken for granted and quoted
in many articles and books. Recently J. L. Hodges, Jr. [11] has produced examples of consistent
and asymptotically normal estimates, not having the properties (i) and (ii), whose asymptotic
variances never exceed those of the maximum likelihood estimates and, for some values of the
parameter, are actually smaller.

Obviously, $q_r$ represents the relative frequency of pairs (X, Y) which have their sum
X + Y = r. The maximum likelihood equations are obtained by differentiating (43)
with respect to α, β and A and by equating the derivatives to zero. We have:

(45)  $\log\frac{\hat\beta}{\hat\beta + \hat A + 1} + \sum_{t=0}^{\infty}\frac{1 - \sum_{r=0}^{t} q_r}{\hat\alpha + t} = 0$ ,

(46)  $\frac{\hat\alpha}{\hat\beta} - \frac{\hat\alpha + \bar X + \bar Y}{\hat\beta + \hat A + 1} = 0$ ,

(47)  $\frac{\bar Y}{\hat A} - \frac{\hat\alpha + \bar X + \bar Y}{\hat\beta + \hat A + 1} = 0$ .

Equations (46) and (47) imply

(48)  $\hat\alpha = \hat\beta\bar X$ ,

(49)  $\bar Y = \hat A\bar X$ ,

and then equation (45) gives

(50)  $\log\Bigl(1 + \frac{\bar X + \bar Y}{\hat\alpha}\Bigr) = \sum_{t=0}^{\infty}\frac{1 - \sum_{r=0}^{t} q_r}{\hat\alpha + t}$ .


The problem of computing the maximum likelihood estimates $\hat\alpha$, $\hat\beta$ and $\hat A$ is thus
reduced to the following operations. First we calculate the means $\bar X$ and $\bar Y$ of the
observed values of X and Y, respectively, and the relative frequencies $q_r$ as indicated
in formulae (44). Upon substituting them into (50) the trial and error method gives
the value of $\hat\alpha$. Next

(51)  $\hat\beta = \frac{\hat\alpha}{\bar X}$ ,  $\hat A = \frac{\bar Y}{\bar X}$ .
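
The computation just described can be sketched in Python as follows. The trial and error step is replaced here by bisection on the difference of the two sides of (50); this substitution, the function names, the search interval, and the small table of counts are all assumptions made for illustration. A finite root is to be expected only when the sums X + Y show more spread than a Poisson variable with the same mean, since otherwise the two sides of (50) meet only in the limit.

```python
import math

def summaries(counts):
    """Means of X and Y and the tail sums 1 - (q_0 + ... + q_t) from a table
    {(k, m): n_km} of observed frequencies, as in formulae (44)."""
    n = float(sum(counts.values()))
    x_bar = sum(k * c for (k, m), c in counts.items()) / n
    y_bar = sum(m * c for (k, m), c in counts.items()) / n
    q = [0.0] * (max(k + m for (k, m) in counts) + 1)
    for (k, m), c in counts.items():
        q[k + m] += c / n
    tails, cum = [], 0.0
    for q_r in q:
        cum += q_r
        tails.append(max(0.0, 1.0 - cum))
    return x_bar, y_bar, tails

def equation_50(alpha, s, tails):
    """Left side minus right side of (50), with s = mean of X + mean of Y."""
    right = sum(w / (alpha + t) for t, w in enumerate(tails))
    return math.log1p(s / alpha) - right

def fit(counts, lo=1e-6, hi=1e4, steps=200):
    """Estimates of alpha, beta and A from (50) and (51)."""
    x_bar, y_bar, tails = summaries(counts)
    s = x_bar + y_bar
    if equation_50(lo, s, tails) >= 0.0 or equation_50(hi, s, tails) <= 0.0:
        raise ValueError("no sign change of (50) on the search interval")
    for _ in range(steps):
        mid = math.sqrt(lo * hi)          # bisection on a logarithmic scale
        if equation_50(mid, s, tails) < 0.0:
            lo = mid
        else:
            hi = mid
    alpha = math.sqrt(lo * hi)
    return alpha, alpha / x_bar, y_bar / x_bar

# Hypothetical frequencies n_km, invented for illustration only.
counts = {(0, 0): 30, (0, 1): 20, (1, 0): 8, (1, 1): 10, (0, 2): 10, (1, 2): 6,
          (2, 1): 3, (0, 4): 5, (2, 3): 4, (1, 5): 2, (3, 6): 2}
alpha_hat, beta_hat, A_hat = fit(counts)
print(alpha_hat, beta_hat, A_hat)
```

Because the estimates depend on the data only through relative frequencies, the same function may be given a table of probabilities instead of counts; fed with the values of (40) for known parameters, it should return those parameters.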

In trying to obtain $\hat\alpha$ it is well to notice that the two sides of equation (50) tend to
the same limit zero as $\hat\alpha$ is indefinitely increased. The first trial value, say $\alpha_0$, may
be conveniently obtained as follows. We notice that the result of substituting $\hat\alpha$,
$\hat\beta$ and $\hat A$ in

(52)  $P_{X,Y}(0, 0) = \Bigl(\frac{\beta}{\beta + A + 1}\Bigr)^{\alpha}$

should give a result comparable to $q_0$. Using equations (48) and (49) we have

(53)  $\frac{\hat\beta}{\hat\beta + \hat A + 1} = \frac{\hat\alpha}{\hat\alpha + \bar X + \bar Y}$ .

Thus, the first trial value of α can be taken to satisfy the equation

(54)  $\Bigl(\frac{\alpha_0}{\alpha_0 + \bar X + \bar Y}\Bigr)^{\alpha_0} = q_0$

which is equivalent to

(55)  $\log(1 + z) = -\frac{\log q_0}{\bar X + \bar Y}\,z$

with $z = (\bar X + \bar Y)/\alpha_0$. In order to obtain $\alpha_0$ we make a graph of the logarithmic
function

(56)  $y = \log(1 + z)$ .

Next we plot the straight line

(57)  $y = -\frac{\log q_0}{\bar X + \bar Y}\,z$ .

The two lines have two points in common, one at $z_1 = 0$ and the other at
$z_2 = (\bar X + \bar Y)/\alpha_0$, which is obtained graphically. When $z_2$ is obtained, we get
$\alpha_0 = (\bar X + \bar Y)/z_2$.
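
The graphical construction can be imitated numerically. The sketch below finds the nonzero root $z_2$ of (55) by bisection and returns $\alpha_0 = (\bar X + \bar Y)/z_2$. The starting interval relies on the elementary inequality log(1 + x) > x − x²/2 for x > 0; the function name and the numerical inputs are hypothetical. A nonzero intersection exists only if the slope of the line (57) lies strictly between 0 and 1, that is, if $e^{-(\bar X + \bar Y)} < q_0 < 1$.

```python
import math

def first_trial_alpha(q0, s):
    """First trial value alpha_0 from (54)-(57); s is the sum of the two means."""
    slope = -math.log(q0) / s             # slope of the straight line (57)
    if not 0.0 < slope < 1.0:
        raise ValueError("lines (56) and (57) do not cross at a positive z")

    def gap(z):                           # curve (56) minus line (57)
        return math.log1p(z) - slope * z

    lo = 1.0 - slope                      # gap(lo) > 0 for this choice
    hi = 2.0 * lo
    while gap(hi) > 0.0:                  # move right until the line is above
        lo, hi = hi, 2.0 * hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return s / (0.5 * (lo + hi))

# Hypothetical inputs: q0 = 0.30 and mean of X plus mean of Y = 1.65.
alpha_0 = first_trial_alpha(0.30, 1.65)
print(alpha_0)
```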

4.  Empirical  test  of  the  fundamental  hypothesis.  As  mentioned  before,  the  validity 
of  the  fundamental  hypothesis  considered  in  this  paper  and,  in  particular,  of  the 
joint  bivariate  negative  binomial  distribution  (40),  should  be  tested  with  respect 
to  the  particular  types  of  accidents  that  may  come  under  study.  Thus,  for  example, 
if  it  is  attempted  to  apply  the  conclusions  of  this  paper  to  the  selection  of  airplane 
pilots  through  the  use  of  an  individual's  record  of  minor  accidents  during  the  years 
before  the  training  in  order  to  obtain  individuals  with  low  proneness  for  aviation 
accidents, then the validity of the fundamental hypothesis should be tested on observations
regarding the numbers X and Y of each kind of accident actually suffered
by a number of individuals. Owing to the lack of data, no such test is possible at
present. However, because of the far-reaching character of the fundamental hypothesis,
it is of interest to inquire whether or not there are any accidents at all with
respect to which this hypothesis is at least approximately true.

To  investigate  this  point,  formula  (40)  was  tried  in  connection  with  the  following 
two  sets  of  data.  The  first  set  was  obtained  through  the  courtesy  of  Dr.  Rosedith 
Sitgreaves  and  Dr.  W.  M.  Gafafer,  to  whom  the  authors  are  deeply  indebted.  Special 
thanks  are  due  to  Dr.  J.  G.  Townsend,  Chief,  Division  of  Industrial  Hygiene, 
Public  Health  Service,  Federal  Security  Agency,  who  released  the  data  collected  by 
the  Division  of  Industrial  Hygiene. 

The  data  are  concerned  with  two  different  categories  of  employees  of  an  industrial 
establishment:  Group  1  =  office  workers,  and  Group  2  =  industrial  workers.  For 
each  of  these  two  groups  the  data  list  the  numbers  of  cases  of  incapacity  suffered 
during  a  period  of  time  due  to  the  following  causes: 

Cause 1 = Respiratory disease
Cause 2 = Digestive disease
Cause 3 = Nonindustrial injury
Cause 4 = Industrial injury

Each case of incapacity from any of the four causes was treated as an accident of a
special category.

The  other  set  of  data  on  which  the  test  of  the  fundamental  hypothesis  was  made 
is  taken  from  the  publication  of  Farmer  and  Chambers  [3] .  This  is  concerned  with 
accidents  incurred  by  166  London  bus  drivers  during  five  successive  years  of  service. 
On  these  data,  two  tests  of  the  model  were  made,  once  taking  the  experience  of  the 
first  four  years  of  service  of  each  man  as  one  variable  and  the  experience  of  the  fifth 
as  the  other  and  then  treating  the  number  of  accidents  in  the  first  year  of  service 
as  one  of  the  two  variables  and  the  number  of  accidents  in  the  subsequent  four  years 
as  the  other. 


TABLE 1

Test of the Validity of the Fundamental Hypothesis on Two Sets of Data

                                      Estimated parameters       No. of      Degrees of
Data on:                              α         β         A      individuals  freedom     P(χ²)

Employees of an industrial concern
  Cause 1 vs 2, Gr. 1                 1.452     1.407     4.729      407         37       .10
  Cause 1 vs 2, Gr. 2                 1.471     1.050     3.798     1272         95       Practically zero
  Cause 1 vs 3, Gr. 1                 1.657     4.750    13.986      407         39       .090
  Cause 1 vs 3, Gr. 2                 1.686     4.734    15.075     1272         58       .00053
  Cause 2 vs 3, Gr. 1                 0.922     2.662     2.979      407         16       .0017
  Cause 2 vs 3, Gr. 2                 0.846     2.377     3.978     1272         30       Practically zero
  Cause 3 vs 4, Gr. 1                 1.309    28.046     8.421      407          3       .59
  Cause 3 vs 4, Gr. 2                 1.385     3.888     0.740     1272         11       Practically zero

London bus drivers
  Fifth year vs four first years      3.490     2.021     4.125      166         38       .35
  First year vs last four years       5.596     3.086     3.419      166         32       .21

Table 1 gives the results of all these tests. The first three columns give the values
of the estimated parameters of the distribution (40), the fourth column gives the
number of individuals to whom the particular observations refer, the fifth the number
of degrees of freedom in applying the χ² test and the sixth the value of the probability
P(χ²) of obtaining a value of χ² exceeding that observed.

Tables 2 to 11 give the bivariate distributions and the details of comparisons
between the theory and the observations summarized in table 1. Thin lines indicate
the boundaries of the particular cells. Heavy lines indicate the grouping adopted in
the application of the χ² test. Observed frequencies are written in the upper left
corner of particular cells. The two other figures, each with one decimal digit, are the
expected frequency (on the left) and the contribution to χ² of one particular cell
(if the expected frequency for that cell is 3 or more) or for a group of several adjoining
cells. If the expected frequencies of several cells are found to be less than 3, then
they are grouped and the expected frequency is given for the entire group of cells
only.

TABLE 5

Comparison of Observed and Theoretical Distributions of Incapacities
Cause 1 vs. Cause 3, Group 2 (Div. Ind. Hyg., U.S. Pub. Health Serv.)

TABLE 6

Comparison of Observed and Theoretical Distributions of Incapacities
Cause 2 vs. Cause 3, Group 1 (Div. Ind. Hyg., U.S. Pub. Health Serv.)

TABLE 7

Comparison of Observed and Theoretical Distributions of Incapacities
Cause 2 vs. Cause 3, Group 2 (Div. Ind. Hyg., U.S. Pub. Health Serv.)

It will be seen that in three out of the ten cases studied the fit provided by the
bivariate negative binomial is excellent. In two additional cases, the fit is not very
good but still passable. In the remaining five cases the fit is poor.

The data summarized in table 1 refer to three groups of workers and the three
samples contain 166, 407, and 1272 individuals, respectively. Cases of good and of
bad fit are unevenly distributed and, in fact, in all cases relating to the largest
sample (1272 individuals) the fit is bad. This suggests that, probably, the true
distribution of numbers of accidents does not coincide with the negative binomial
in any of the cases studied. However, the divergence between the actual distribution
and the negative binomial must be only slight and to detect it one needs a substantial
number of observations.

TABLE 8

Comparison of Observed and Theoretical Distributions of Incapacities
Cause 3 vs. Cause 4, Group 1 (Div. Ind. Hyg., U.S. Pub. Health Serv.)

Furthermore, a closer examination of tables where the fit is poor suggests that this
may be due to the coexistence of two distinctly different subgroups of individuals,
one large and one relatively small, with two different machineries behind the distribution
of accidents. Owing to the difference in weights, the bivariate negative binomial
approximates the actual distribution in the larger subgroup. However, the
presence of the divergent smaller subgroup spoils the fit.

This conclusion is suggested by all the tables but the suggestion is particularly
strong in the short table 9. It will be seen that the greatest contributions to the χ²,
namely 13.9 and 8.0, come from the two cells (3 ≤ X, Y = 0) and (3 ≤ X, Y = 1),
with the total expected number of individuals 6.9 as against the observed 19. However,
if these two cells are combined with the two corresponding cells in the same
rows, the contributions of the combined cells to the χ² become 1.2 and 0.1 respectively
and the total χ² sinks to a value just exceeding the 5 per cent point. Noticing
that the grouping performed concerns the total of 46 individuals as against the
sample of 1272, one is led to believe that, as far as the bulk of this sample is concerned,
the fundamental hypothesis is not seriously wrong and that the disagreement
noted is due to a relatively small admixture of individuals with an accident
proneness machinery different from that in the main body of data.

The general tentative conclusion is that cases do exist in which the fundamental
hypothesis applies approximately (a) to accidents of two different types incurred
during the same period of observation and (b) to the same kind of accidents
incurred in two successive periods of observation. In these circumstances it is plausible
that the fundamental hypothesis may be satisfied by two kinds of accidents
incurred during two different periods of observation.

TABLE 9

Comparison of Observed and Theoretical Distributions of Incapacities
Cause 3 vs. Cause 4, Group 2 (Div. Ind. Hyg., U.S. Pub. Health Serv.)

Keeping in mind that the subject of the present paper is the possibility of using
accidents of one kind to predict the number of accidents of another kind, it was
thought useful to reproduce the regressions of the number of accidents of one kind
on the actual number of accidents of another kind. These regressions are given in
figures 1 to 5. In each the straight lines correspond to the linear equation (37) of regression
based on the fundamental hypothesis.

When inspecting these figures one should bear in mind that regression points
corresponding to large values of the independent variable depend upon very
moderate numbers of observations. Furthermore, as we have seen, the conditional
variance of one variable, say Y, given a fixed value of the other, say X, increases
with an increase in the value of X.

It will be seen that in many cases the fit is excellent. This is particularly true for
regressions of the numbers of the less frequent accidents on those of the more frequent
ones. Furthermore, the observed regression points are generally closer to the
theoretical line for small values of the independent variable than for larger ones.
This circumstance is important because if and when the selection of personnel is
made on the ground of the number of accidents, one would naturally select those
individuals who in the past had few accidents. The graphs of the regressions suggest
that the results of this kind of selection will be in a reasonable agreement with predictions
based on the fundamental hypothesis.


TABLE 10

Comparison of Observed and Theoretical Distributions of Accidents
First Year vs. Last Four Years (Farmer and Chambers)

TABLE 11

Comparison of Observed and Theoretical Distributions of Accidents
Fifth Year vs. First Four Years (Farmer and Chambers)

Fig. 1. Regression of X on Y and of Y on X. Where X = number of accidents in first year.

Fig. 2. Regression of X on Y and of Y on X. Where X = number of cases of digestive
disease, and Y = number of cases of respiratory disease.



Bates-Neyman: Accident Proneness. I

5. Measures of success in selection of personnel. In this section we study the
following question. Suppose that the fundamental hypothesis applies to certain
types of light and of severe accidents. Suppose further that the number Y of light
accidents incurred in the past is adopted as a criterion for selecting personnel in
order to diminish the number X of severe accidents to be incurred in the future.
Specifically, we shall assume that the individuals selected for the particular hazardous
employment will be all those for whom the number of light accidents Y < k
and a certain proportion Q of those for whom Y = k, where k and Q are so adjusted
that the total number of individuals selected for employment represent a predetermined
proportion P of available candidates.

Fig. 5. Regression of X on Y and of Y on X (Group 1 and Group 2). Where X = number
of cases of industrial injury, and Y = number of cases of nonindustrial injury.

In  these  circumstances,  the  interesting  question  is:  what  is  the  probability  that 
in  the  following  period  of  observation  an  individual  selected  for  employment  will 
have  no  severe  accidents  at  all?  This  probability,  say 

(58)  P{X = 0 | P} ,

compared with the probability P{X = 0} in the unselected population, appears to
be a suitable measure of the success of the selection against severe accidents.

In order to obtain P{X = 0} we use the probability generating function (39) of
X and Y and substitute in it u = 0 and v = 1. The result is

(59)  $P\{X = 0\} = \Bigl(\frac{\beta}{\beta + 1}\Bigr)^{\alpha}$ .

This,  then,  is  the  probability  of  no  severe  accidents  during  the  forthcoming  period 
of  observation  for  the  nonselected  population. 

In order to compute (58), we first determine k and Q to satisfy the conditions
imposed. The probability generating function of Y is obtained from (39) by substituting
u = 1. Expanding the result in powers of v we get

(60)  $P_Y(m) = \Bigl(\frac{\beta}{\beta + A}\Bigr)^{\alpha}\Bigl(\frac{A}{\beta + A}\Bigr)^{m}\frac{\Gamma(\alpha + m)}{m!\,\Gamma(\alpha)}$ .

The number k is determined by the condition

(61)  $\sum_{m=0}^{k-1} P_Y(m) \le P < \sum_{m=0}^{k} P_Y(m)$ .

Then

(62)  $Q = P - \sum_{m=0}^{k-1} P_Y(m)$ .

Once k and Q are found, then (58) is computed by a simple application of the
formula of Bayes with the use of (40).

(63)  $P\{X = 0 \mid P\} = \frac{1}{P}\Bigl[P\{(X = 0)(Y < k)\} + Q\,\frac{P\{(X = 0)(Y = k)\}}{P\{Y = k\}}\Bigr]$

      $= \frac{1}{P}\Bigl[\Bigl(\frac{\beta}{\beta + A + 1}\Bigr)^{\alpha}\sum_{m=0}^{k-1}\frac{\Gamma(\alpha + m)}{m!\,\Gamma(\alpha)}\Bigl(\frac{A}{\beta + A + 1}\Bigr)^{m} + Q\Bigl(\frac{\beta + A}{\beta + A + 1}\Bigr)^{\alpha + k}\Bigr]$ .


Suppose  that  for  a  given  population  of  candidates  for  employment  and  for  a  given 
pair of kinds of accidents the values of α, β and A have been determined. Suppose
further  that  the  proportion  P  of  candidates  to  be  selected  for  employment  is  also 
determined.  In  order  to  estimate  the  prospective  success  of  selection  of  candidates 
we  first  compute  the  standard  of  comparison  (59)  and  then  determine  k  and  Q  to 
satisfy  (61)  and  (62).  Then  these  values  are  substituted  into  (63). 
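
The procedure of this section can be sketched in Python. The code follows formulae (59) to (63), reading Q in (62) as the share of all candidates that is drawn from those with Y = k; the function names and the numerical values (α = 3, β = 2, with A = 1 and A = 5, and P = .25) are illustrative assumptions and are not meant to reproduce table 12.

```python
import math

def nb_weight(m, alpha):
    """Gamma(alpha + m) / (m! Gamma(alpha))."""
    return math.exp(math.lgamma(alpha + m) - math.lgamma(alpha) - math.lgamma(m + 1))

def p_light(m, alpha, beta, A):
    """Formula (60): P{Y = m}."""
    return nb_weight(m, alpha) * (beta / (beta + A)) ** alpha * (A / (beta + A)) ** m

def cutoff(alpha, beta, A, P):
    """k and Q satisfying (61) and (62) for a proportion P, 0 < P < 1."""
    k, below = 0, 0.0
    while below + p_light(k, alpha, beta, A) <= P:
        below += p_light(k, alpha, beta, A)
        k += 1
    return k, P - below

def p_no_severe_unselected(alpha, beta):
    """Formula (59)."""
    return (beta / (beta + 1.0)) ** alpha

def p_no_severe_selected(alpha, beta, A, P):
    """Formula (63)."""
    k, Q = cutoff(alpha, beta, A, P)
    d = beta + A + 1.0
    head = (beta / d) ** alpha * sum(nb_weight(m, alpha) * (A / d) ** m for m in range(k))
    return (head + Q * ((beta + A) / d) ** (alpha + k)) / P

for A in (1.0, 5.0):
    k, Q = cutoff(3.0, 2.0, A, 0.25)
    print(A, k, round(Q, 4),
          round(p_no_severe_selected(3.0, 2.0, A, 0.25), 4),
          round(p_no_severe_unselected(3.0, 2.0), 4))
```

Comparing the last two printed columns gives the measure of success of the selection for each value of A.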

Naturally, the effect of selection of candidates depends on all four parameters
involved, on α and β characterizing the distribution of Λ in the population of candidates
for employment, on the number A and on the proportion P of those to be
selected. In the unselected population the expectation of Λ and its variance are

(64)  $E(\Lambda) = \frac{\alpha}{\beta}$ ,  $\sigma_\Lambda^2 = \frac{\alpha}{\beta^2} = \frac{E(\Lambda)}{\beta}$ .

If the variance $\sigma_\Lambda^2$ is very small (and this will happen when β is a large number)
then even a very sharp selection will give practically no result. In the cases considered
in table 1 the values of β are moderate and, therefore, the prospects for selection
are promising. Turning to the other factors involved, it must be obvious that the
smaller P is the sharper must be the selection and, therefore, the greater its effect.
Finally, the effect of selection depends considerably on the value of A, which is the
ratio of the average frequencies of light and of severe accidents,

(65)  $A = \frac{E(Y)}{E(X)}$ ,

in the unselected population. Because of this interpretation the quotient A may be
called the modulus of the relative frequency of light accidents.


TABLE 12

Corresponding Values of k and Q for a Set of Increasing Values of the Modulus
of Relative Frequency A

4   8   5   11   8   12   .0282   17   .013   25   .014   11   .0172   17   .0038   24   34

The actual numbers characterizing the possible effect of selection are of practical
importance. With this in mind table 12 and figures 6 and 7 were constructed. They
illustrate two hypothetical situations. In one of them the values of α = 3 and β = 2
approximately coincide with those corresponding to the experience of the London
bus drivers (see table 1). In the other case, α = 3 and β = 1, so that both the expectation
of Λ and its variance are increased. The figures are intended to illustrate the
effect of selection corresponding to two different levels of sharpness of selection. In
one case we assume P = .125 and in the other P = .250. The value of the modulus
A varies from A = 1 to A = 20. For a succession of increasing values of A, table 12
gives the corresponding values of k and Q with which the proportion of selected
candidates will be equal to P. Figures 6 and 7 give the corresponding values of
P{X = 0 | P}. The horizontal dashed line indicates the standard of comparison
P{X = 0}. It is seen that in both cases, when A is small, the effect of selection is
already noticeable. When A is substantial, say A ≥ 5, then the probability of
avoiding severe accidents is considerably increased by selection. The practical conclusion
suggested by this result is that, in order that the selection of personnel on
the basis of light accidents incurred in the past be successful, it is desirable that the
average number of light accidents during the period of observation be large. This
may be achieved either by taking a long period of observation (which may be impracticable)
or by using some artifice to increase the exposure to light accidents
during a relatively short period of observation.

Fig. 6. Effect of selection against high accident proneness (α = 3, β = 2).

6. Joint distribution of the number of light accidents and of the number of survived
severe accidents. As mentioned in the introduction, even if the fundamental
hypothesis assumed in this paper is strictly satisfied with regard to a category of
light accidents and a category of severe accidents, if these latter accidents are really
severe, then their number incurred during a fixed period of time will not follow the
negative binomial distribution. The reason is that from time to time a severe accident,
occurring at the early part of the period of observation, will prove fatal to the
individual concerned. As a result, there will be no exposure of this individual to
possible further severe accidents during the same period of observation. Thus, if
and when statistics relating to light and to severe accidents sustained by the same
individual become available, then in order to be able to verify the fundamental
hypothesis and to estimate the constants involved, a new type of distribution will
be necessary. This must take into account the fact that each severe accident may
lead to invalidism or to death for the individual concerned. The purpose of this
section is to consider this distribution. Our basic assumption, supplementing the
fundamental hypothesis, will be that each individual involved in a severe accident
has the same probability θ of surviving the accident and continuing the employment
with all its hazards. The alternative to such survival will be either death or retirement
from the particular employment. However, this distinction may be ignored
and we shall speak of two possibilities only: survival (in good health) or death (the
latter meaning either actual death or retirement).

In connection with the change in the problem, we shall need new notation. The
letter Y will be used, as formerly, to denote the number of light accidents incurred
by an individual during a period of observation. On the other hand, the letter X will
be used to denote the number of severe accidents that this individual will survive,
incurred by the individual during the same or a different period of observation.
Thus, if an individual incurs three severe accidents and dies at the third, then for
this individual X = 2. In order to distinguish between deaths and survivals we
shall need a third random variable Z. This variable will be defined to be equal to
zero if the particular individual survives all the period of observation, and unity
if the individual does not.

The  statistics  of  light  and  severe  accidents  may  be  divided  into  two  categories. 
First  we  postulate  the  availability  of  the  numbers  of  light  and  of  severe  accidents 
for those individuals who survived the entire period of observation of severe accidents.
The figures obtainable from these statistics will be the empirical counterpart
of the theoretical probabilities P{(X = k)(Y = m) | Z = 0}. The second part of the
statistics contemplated would refer to individuals who died as a result of a severe
accident during the period of observation. The figures obtainable from such statistics
would correspond to the probabilities P{(X = k)(Y = m) | Z = 1}. The formulae for
the probability generating functions for these relative probabilities arise as limiting
forms of generating functions deduced under the general hypotheses considered in
Part II of this paper and are as follows:

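(66)  $G_{X,Y|Z=0}(u,v) = \left( \dfrac{\beta + 1 - \theta}{\beta + 1 - \theta u + A(1 - v)} \right)^{\alpha},$

(67)  $G_{X,Y|Z=1}(u,v) = \dfrac{(\beta + 1 - \theta)^{\alpha}}{(\beta + 1 - \theta)^{\alpha} - \beta^{\alpha}} \cdot \dfrac{1 - \theta}{1 - \theta u} \left[ \dfrac{\beta^{\alpha}}{[\beta + A(1 - v)]^{\alpha}} - \dfrac{\beta^{\alpha}}{[\beta + 1 - \theta u + A(1 - v)]^{\alpha}} \right].$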

It is seen that for individuals who survive the period of observation of severe
accidents the joint distribution of the number Y of light accidents and of the number
X of survived severe accidents is again a bivariate negative binomial. On the
other hand, for individuals who die as a result of a severe accident, the joint distribution
of X and Y is more complicated, with probability generating function given
by formula (67).
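
As a numerical illustration, the two generating functions may be evaluated directly. The sketch below assumes that (66) reads ((β + 1 − θ)/(β + 1 − θu + A(1 − v)))^α and that (67) is the product of (β + 1 − θ)^α/[(β + 1 − θ)^α − β^α], of (1 − θ)/(1 − θu), and of the difference β^α/[β + A(1 − v)]^α − β^α/[β + 1 − θu + A(1 − v)]^α. The function names and the parameter values are arbitrary choices made for illustration. The check performed is that both functions equal unity at u = v = 1, as any probability generating function must.

```python
def pgf_survivors(u, v, alpha, beta, theta, a):
    """Formula (66) as read above: joint PGF of (X, Y) given Z = 0."""
    return ((beta + 1 - theta) / (beta + 1 - theta * u + a * (1 - v))) ** alpha


def pgf_deaths(u, v, alpha, beta, theta, a):
    """Formula (67) as read above: joint PGF of (X, Y) given Z = 1."""
    norm = (beta + 1 - theta) ** alpha / ((beta + 1 - theta) ** alpha - beta ** alpha)
    geom = (1 - theta) / (1 - theta * u)
    bracket = ((beta / (beta + a * (1 - v))) ** alpha
               - (beta / (beta + 1 - theta * u + a * (1 - v))) ** alpha)
    return norm * geom * bracket


# Illustrative parameter values only; they are not taken from the paper.
ALPHA, BETA, THETA, A = 2.0, 3.0, 0.9, 5.0

# Both generating functions should equal 1 at u = v = 1.
print(pgf_survivors(1.0, 1.0, ALPHA, BETA, THETA, A))
print(pgf_deaths(1.0, 1.0, ALPHA, BETA, THETA, A))

# The value at u = v = 0 is the conditional probability of X = 0 and Y = 0.
print(pgf_survivors(0.0, 0.0, ALPHA, BETA, THETA, A))
print(pgf_deaths(0.0, 0.0, ALPHA, BETA, THETA, A))
```

Setting v = 1 or u = 1 in either function gives the marginal generating function of X or of Y within the corresponding category of individuals.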

If  and  when  the  data  on  light  and  severe  accidents  are  available,  formulae  (66) 
and  (67)  could  be  used  to  test  the  validity  of  the  fundamental  hypothesis  assumed 
in  the  present  paper. 
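
A comparison of this kind may be sketched on simulated data. The sketch below rests on assumptions that are ours and are stated only for illustration: the accident proneness λ of an individual follows a gamma distribution with shape α and rate β; severe accidents in a period of unit length are Poisson with mean λ; each severe accident is survived with probability θ, independently of the others; light accidents are Poisson with mean Aλ; and (66) reads ((β + 1 − θ)/(β + 1 − θu + A(1 − v)))^α. All function names and parameter values are hypothetical. The empirical generating function of (X, Y) among simulated survivors is compared with the value given by (66) at a single point (u, v).

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by multiplying uniforms (adequate for small means)."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_individual(rng, alpha, beta, theta, a):
    """Return (X, Y, Z) for one individual under the assumed model."""
    lam = rng.gammavariate(alpha, 1.0 / beta)  # gammavariate takes a scale, so pass 1/rate
    x, z = 0, 0
    for _ in range(poisson(rng, lam)):         # severe accidents in the period
        if rng.random() < theta:
            x += 1                             # accident survived
        else:
            z = 1                              # death or retirement
            break
    y = poisson(rng, a * lam)                  # light accidents
    return x, y, z


def pgf_survivors(u, v, alpha, beta, theta, a):
    """Formula (66) as read above."""
    return ((beta + 1 - theta) / (beta + 1 - theta * u + a * (1 - v))) ** alpha


# Illustrative values only; they are not taken from the paper.
ALPHA, BETA, THETA, A = 2.0, 3.0, 0.8, 2.0
U, V = 0.5, 0.7
N = 100_000

rng = random.Random(0)
records = (simulate_individual(rng, ALPHA, BETA, THETA, A) for _ in range(N))
survivors = [(x, y) for x, y, z in records if z == 0]

# Empirical counterpart of G(u, v) given Z = 0: the average of u^X v^Y.
empirical = sum(U ** x * V ** y for x, y in survivors) / len(survivors)
theoretical = pgf_survivors(U, V, ALPHA, BETA, THETA, A)
print(f"empirical {empirical:.4f}  theoretical {theoretical:.4f}")
```

With actual data, the simulated records would be replaced by the observed pairs (X, Y) for the survivors. A formal test would compare observed and expected frequencies rather than a single value of the generating function.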


REFERENCES 

[1] M. Greenwood and G. U. Yule, “An inquiry into the nature of frequency distributions
representative of multiple happenings with particular reference to the occurrence of multiple
attacks of disease or of repeated accidents,” J. Roy. Stat. Soc., Vol. 83 (1920), pp. 255-279.

[2] E. M. Newbold, A contribution to the study of the human factor in the causation of accidents.
Industrial Health Research Board, Report No. 34. London, H. M. Stationery Office, 1926.

[3] E. Farmer and E. G. Chambers, A study of accident proneness among motor drivers.
Industrial Health Research Board, Report No. 84. London, H. M. Stationery Office, 1939.

[4] Ove Lundberg, On Random Processes and Their Application to Sickness and Accident
Statistics. Uppsala, Almqvist and Wiksells, 1940. 172 pp.

[5] G. Pólya, “Sur quelques points de la théorie des probabilités,” Ann. de l'Institut Henri
Poincaré, Vol. 1 (1930), pp. 117-161.

[6] William Feller, “On the theory of stochastic processes, with particular reference to
applications,” Proceedings, Berkeley Symposium on Mathematical Statistics and Probability.
Berkeley, University of California Press, 1949, pp. 403-432.

[7] J. Neyman, First Course in Probability and Statistics. New York, Holt, 1950.

[8] J. Neyman, “Contribution to the theory of the χ² test,” Proceedings, Berkeley Symposium
on Mathematical Statistics and Probability. Berkeley, University of California Press, 1949,
pp. 239-275.

[9] F. Y. Edgeworth, “On the probable errors of frequency constants,” J. Roy. Stat. Soc.,
Vol. 71 (1908), pp. 662-678 (Appendix).

[10] R. A. Fisher, “On the mathematical foundations of theoretical statistics,” Philos. Trans.
Roy. Soc., Ser. A, Vol. 222 (1922), pp. 309-368.

[11] Evelyn Fix and Jerzy Neyman, “A simple stochastic model of recovery, relapse, death
and loss of patients,” Human Biology, Vol. 23 (1951), pp. 205-241.


